Refining multiple sequence alignments with conserved core regions

Authors

  • Saikat Chakrabarti
  • Christopher J. Lanczycki
  • Anna R. Panchenko
  • Teresa M. Przytycka
  • Paul A. Thiessen
  • Stephen H. Bryant
Abstract

Accurate multiple sequence alignments of proteins are essential to several areas of computational biology: they inform the phylogenetic history of domain families as well as their identification and classification. This article presents a new algorithm, REFINER, that refines a multiple sequence alignment by iteratively realigning its individual sequences against the predetermined conserved core (block) model of a protein family. Realigning each sequence can correct misalignments between that sequence and the rest of the profile while preserving the family's overall block model. Large-scale benchmarking studies showed a noticeable improvement in alignment quality after refinement, as inferred from increased alignment scores and from the enhanced database-search sensitivity of sequence profiles derived from the refined alignments compared with the original alignments. A standalone version of the program is available by ftp distribution (ftp://ftp.ncbi.nih.gov/pub/REFINER) and will be incorporated into the next release of the Cn3D structure/alignment viewer.
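The idea of realigning one sequence at a time against a fixed block model can be illustrated with a minimal Python sketch. This is not the REFINER implementation: the abstract does not specify the scoring scheme or acceptance rule, so the log-odds profile with a uniform background, the pseudocount, the dynamic-programming block placement, and all function names (`block_pssm`, `place_blocks`, `refine`) are assumptions made for illustration. Blocks are modeled as ungapped segments of fixed width, and each row of `starts` records where each block begins in that sequence.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # assumes standard residues only

def block_pssm(seqs, starts, widths, skip, pseudo=1.0):
    """Log-odds scores per block column, built from every row except `skip`."""
    pssm = []
    for b, w in enumerate(widths):
        cols = []
        for c in range(w):
            counts = Counter(s[starts[i][b] + c]
                             for i, s in enumerate(seqs) if i != skip)
            total = sum(counts.values()) + pseudo * len(ALPHABET)
            # uniform background frequency of 1/20 (an assumption)
            cols.append({a: math.log((counts[a] + pseudo) / total * len(ALPHABET))
                         for a in ALPHABET})
        pssm.append(cols)
    return pssm

def score_row(seq, row, pssm, widths):
    """Profile score of one sequence for a given set of block starts."""
    return sum(pssm[b][c][seq[row[b] + c]]
               for b, w in enumerate(widths) for c in range(w))

def place_blocks(seq, pssm, widths):
    """Best ordered, non-overlapping, ungapped placement of all blocks on seq."""
    n, neg = len(seq), float("-inf")
    # best[b][p]: best score for blocks 0..b-1 placed inside seq[:p]
    best = [[0.0] * (n + 1)] + [[neg] * (n + 1) for _ in widths]
    choice = [[None] * (n + 1) for _ in widths]
    for b, w in enumerate(widths):
        for p in range(1, n + 1):
            best[b + 1][p] = best[b + 1][p - 1]
            choice[b][p] = choice[b][p - 1]
            start = p - w
            if start >= 0 and best[b][start] > neg:
                s = best[b][start] + sum(pssm[b][c][seq[start + c]]
                                         for c in range(w))
                if s > best[b + 1][p]:
                    best[b + 1][p], choice[b][p] = s, start
    row, p = [], n
    for b in reversed(range(len(widths))):  # trace back block starts
        p = choice[b][p]
        row.append(p)
    return best[-1][n], row[::-1]

def refine(seqs, starts, widths, rounds=3):
    """Leave-one-out refinement: realign each row, keep strict improvements."""
    starts = [list(r) for r in starts]
    for _ in range(rounds):
        changed = False
        for i, seq in enumerate(seqs):
            pssm = block_pssm(seqs, starts, widths, skip=i)
            score, row = place_blocks(seq, pssm, widths)
            if score > score_row(seq, starts[i], pssm, widths) + 1e-9:
                starts[i], changed = row, True
        if not changed:
            break
    return starts

# Toy family with two blocks of width 4; the last row's first block
# is deliberately shifted by one position.
seqs = ["GGACDEGGGWYWYG", "GACDEGGWYWYGG", "GGGACDEGWYWYG", "GACDEGGGWYWYG"]
initial = [[2, 9], [1, 7], [3, 8], [2, 8]]
refined = refine(seqs, initial, widths=[4, 4])
print(refined)
```

Because block widths and block order never change, the family's block model is preserved by construction; only the position of each block within an individual sequence moves. The sketch assumes every sequence is long enough to hold all blocks and contains only the 20 standard residues.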

Related resources

A Method of Multiple Protein Sequence Alignment Using a Hybrid Approach

Multiple protein sequence alignment is an extension of pairwise alignment to incorporate more than two sequences at a time. Multiple protein sequence alignment methods try to align all of the sequences in a given query set. Multiple protein sequence alignments are often used in identifying conserved sequence regions across a group of sequences hypothesized to be evolutionarily related. Many app...

SeqFIRE: a web application for automated extraction of indel regions and conserved blocks from protein multiple sequence alignments

Analyses of multiple sequence alignments generally focus on well-defined conserved sequence blocks, while the rest of the alignment is largely ignored or discarded. This is especially true in phylogenomics, where large multigene datasets are produced through automated pipelines. However, some of the most powerful phylogenetic markers have been found in the variable length regions of multiple al...

Non-sequential structure-based alignments reveal topology-independent core packing arrangements in proteins

MOTIVATION Proteins of the same class often share a secondary structure packing arrangement but differ in how the secondary structure units are ordered in the sequence. We find that proteins that share a common core also share local sequence-structure similarities, and these can be exploited to align structures with different topologies. In this study, segments from a library of local sequence-...

Dynamic use of multiple parameter sets in sequence alignment

The level of conservation between two homologous sequences often varies among sequence regions; functionally important domains are more conserved than the remaining regions. Thus, multiple parameter sets should be used in alignment of homologous sequences with a stringent parameter set for highly conserved regions and a moderate parameter set for weakly conserved regions. We describe an alignme...

Molecular cloning of adenylate kinase from the human filarial parasite Onchocerca volvulus

Adenylate kinases (ADK) are ubiquitous enzymes that contribute to the homeostasis of adenine nucleotides in living cells. In this study, the cloning of a cDNA encoding an adenylate kinase from the filaria Onchocerca volvulus has been described. Using PCR technique, a 281 bp cDNA fragment encoding part of an adenylate kinase was isolated from an O. volvulus cDNA library. Use of this fragment as a p...

Journal:
  • Nucleic Acids Research

Volume 34  Issue 

Pages  -

Publication date 2006